 batch normalisation



Hybrid Batch Normalisation: Resolving the Dilemma of Batch Normalisation in Federated Learning

Chen, Hongyao, Xu, Tianyang, Wu, Xiaojun, Kittler, Josef

arXiv.org Artificial Intelligence

Batch Normalisation (BN) is widely used in conventional deep neural network training to harmonise the input-output distributions for each batch of data. However, federated learning, a distributed learning paradigm, faces the challenge of non-independent and identically distributed data across the client nodes. Because there is no coherent methodology for updating BN statistical parameters in this setting, standard BN degrades federated learning performance, and an alternative normalisation solution for federated learning is needed. In this work, we resolve the dilemma of the BN layer in federated learning by developing a customised normalisation approach, Hybrid Batch Normalisation (HBN). HBN separates the update of statistical parameters (i.e., the means and variances used for evaluation) from that of learnable parameters (i.e., parameters that require gradient updates), obtaining unbiased estimates of the global statistical parameters in distributed scenarios. In contrast with existing solutions, we emphasise the supportive power of global statistics for federated learning. The HBN layer introduces a learnable hybrid distribution factor, allowing each computing node to adaptively mix the statistical parameters of the current batch with the global statistics. HBN can serve as a powerful plugin to advance federated learning performance, showing promising merits across a wide range of federated learning settings, especially for small batch sizes and heterogeneous data.
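As a rough illustration of the mixing idea described in the abstract, the sketch below blends the current batch's statistics with global statistics via a hybrid factor `alpha`. The function name, the 1-D setting, and the exact blending rule are assumptions for illustration; they are not taken from the paper, which treats `alpha` as a learnable per-layer parameter inside a full network.

```python
import math

def hybrid_batch_norm(x, global_mean, global_var, alpha, eps=1e-5):
    """Hybrid-normalisation sketch: blend the current batch's statistics
    with global (server-aggregated) statistics using a hybrid factor
    alpha in [0, 1].  alpha = 1 recovers standard BN behaviour;
    alpha = 0 normalises purely with the global statistics."""
    n = len(x)
    batch_mean = sum(x) / n
    batch_var = sum((v - batch_mean) ** 2 for v in x) / n
    mean = alpha * batch_mean + (1 - alpha) * global_mean
    var = alpha * batch_var + (1 - alpha) * global_var
    return [(v - mean) / math.sqrt(var + eps) for v in x]

# With alpha = 1 the layer behaves like plain BN on this batch:
# the output has (approximately) zero mean and unit variance.
out = hybrid_batch_norm([1.0, 2.0, 3.0, 4.0],
                        global_mean=0.0, global_var=1.0, alpha=1.0)
```

With `alpha = 0` every node would normalise with the same global statistics, which is what makes small, heterogeneous client batches less harmful.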


Reviews: Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Neural Information Processing Systems

The suggested reparametrisation and its theoretical analysis are very interesting and I enjoyed reading the paper. However, some points in the theoretical analysis could be improved. The paper argues that the new parametrisation improves the conditioning of the gradient, but neither a strong theoretical argument nor an empirical demonstration of this is given. In line 127 it is said "Empirically, we find that w is often (close to) a dominant eigenvector of the covariance matrix C", but the corresponding experiments are shown neither in the paper nor in the supplemental material. In lines 122/123 the authors claim "It has been observed that neural networks with batch normalization also have this property (to be relatively insensitive to different learning rates), which can be explained by this analysis." However, it did not become clear to me how the analysis of the previous sections can be directly transferred to batch normalisation.
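For context, the reparameterisation under review is simple to state: the weight vector is written as w = (g / ||v||) v, decoupling its direction (v) from its magnitude (g). A minimal sketch:

```python
import math

def weight_norm(v, g):
    """Weight normalisation: reparameterise w = (g / ||v||) * v,
    so the scalar g carries the magnitude and v only the direction."""
    norm = math.sqrt(sum(c * c for c in v))
    return [g / norm * c for c in v]

# ||v|| = 5, so w ≈ [1.2, 1.6] and ||w|| = g = 2.
w = weight_norm([3.0, 4.0], g=2.0)
```

Gradients are then taken with respect to g and v separately, which is where the conditioning argument questioned above comes in.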


Evaluation of Data Augmentation and Loss Functions in Semantic Image Segmentation for Drilling Tool Wear Detection

Schlager, Elke, Windisch, Andreas, Hanna, Lukas, Klünsner, Thomas, Hagendorfer, Elias Jan, Teppernegg, Tamara

arXiv.org Artificial Intelligence

Tool wear monitoring is crucial for quality control and cost reduction in manufacturing processes, of which drilling applications are one example. In this paper, we present a U-Net based semantic image segmentation pipeline, deployed on microscopy images of cutting inserts, for the purpose of wear detection. The wear area is differentiated into two different types, resulting in a multiclass classification problem. Joining the two wear types into one general wear class, on the other hand, allows the problem to be formulated as a binary classification task. Apart from the comparison of the binary and multiclass problems, different loss functions, i.e., Cross Entropy, Focal Cross Entropy, and a loss based on the Intersection over Union (IoU), are also investigated. Furthermore, models are trained on image tiles of different sizes, and augmentation techniques of varying intensities are deployed. We find that the best performing models are binary models, trained on data with moderate augmentation and an IoU-based loss function.
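The IoU-based loss mentioned in the abstract is commonly realised as a soft Jaccard loss on per-pixel probabilities; the sketch below shows that common form for the binary case (the paper's exact formulation may differ, and the flattened-pixel-list setting is an assumption for brevity).

```python
def soft_iou_loss(pred, target, eps=1e-6):
    """Soft Jaccard (IoU-based) loss for binary segmentation:
    1 - intersection / union, computed on soft predictions in [0, 1]
    over a flattened list of pixels."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(p + t - p * t for p, t in zip(pred, target))
    return 1.0 - (inter + eps) / (union + eps)

# A perfect prediction gives (almost) zero loss.
loss = soft_iou_loss([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
```

Unlike cross entropy, this loss directly optimises the overlap metric used for evaluation, which is one reason it can win on imbalanced wear masks.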


Feedback-Gated Rectified Linear Units

Kemmerling, Marco

arXiv.org Artificial Intelligence

Feedback connections play a prominent role in the human brain but have not received much attention in artificial neural network research. Here, a biologically inspired feedback mechanism which gates rectified linear units is proposed. On the MNIST dataset, autoencoders with feedback show faster convergence, better performance, and more robustness to noise compared to their counterparts without feedback. Some benefits, although less pronounced and less consistent, can be observed when networks with feedback are applied on the CIFAR-10 dataset.
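One plausible reading of the gating mechanism described above is that a feedback signal from a higher layer multiplicatively modulates a unit's rectified activation; the sketch below illustrates that reading. The sigmoid squashing and the exact multiplicative form are assumptions for illustration, not the paper's precise rule.

```python
import math

def feedback_gated_relu(x, feedback):
    """Illustrative feedback-gated ReLU: a feed-forward pre-activation x
    is rectified and then gated by a feedback signal from a higher
    layer, squashed to (0, 1) here with a sigmoid."""
    gate = 1.0 / (1.0 + math.exp(-feedback))  # sigmoid of the feedback
    return max(0.0, x) * gate

# Neutral feedback (0) gives gate = 0.5, so y = relu(2.0) * 0.5 = 1.0.
y = feedback_gated_relu(2.0, feedback=0.0)
```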


Entangled q-Convolutional Neural Nets

Anagiannis, Vassilis, Cheng, Miranda C. N.

arXiv.org Machine Learning

We introduce a machine learning model, the q-CNN model, sharing key features with convolutional neural networks and admitting a tensor network description. As examples, we apply q-CNN to the MNIST and Fashion-MNIST classification tasks. We explain how the network associates a quantum state to each classification label, and study the entanglement structure of these network states. In our experiments on both datasets, we observe a distinct increase in both the left/right as well as the up/down bipartition entanglement entropy during training as the network learns the fine features of the data. More generally, we observe a universal negative correlation between the value of the entanglement entropy and the value of the cost function, suggesting that the network needs to learn the entanglement structure in order to perform the task accurately. This supports the possibility of exploiting the entanglement structure as a guide to design machine learning algorithms suitable for given tasks.
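The bipartition entanglement entropy studied above is a standard quantity: reshape the pure state across the cut, take its singular values (the Schmidt coefficients), and compute the entropy of their squares. A minimal sketch, assuming a normalised state vector (the function name and interface are illustrative, not from the paper):

```python
import numpy as np

def bipartition_entropy(psi, dim_left):
    """Von Neumann entanglement entropy of a pure state across a
    bipartition: reshape into a (dim_left x dim_right) matrix, take
    singular values s (Schmidt coefficients), and return -sum p ln p
    for p = s**2."""
    m = np.asarray(psi).reshape(dim_left, -1)
    s = np.linalg.svd(m, compute_uv=False)
    p = s ** 2
    p = p[p > 1e-12]  # drop numerically zero Schmidt weights
    return float(-np.sum(p * np.log(p)))

# The Bell state (|00> + |11>)/sqrt(2) is maximally entangled: S = ln 2.
bell = [1 / np.sqrt(2), 0.0, 0.0, 1 / np.sqrt(2)]
entropy = bipartition_entropy(bell, dim_left=2)
```

A product state such as |00> gives zero entropy under the same computation, which is the baseline against which the training-time increase is measured.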


On Batch Normalisation for Approximate Bayesian Inference

Mukhoti, Jishnu, Dokania, Puneet K., Torr, Philip H. S., Gal, Yarin

arXiv.org Machine Learning

We study batch normalisation in the context of variational inference methods in Bayesian neural networks, such as mean-field or MC Dropout. We show that batch normalisation does not affect the optimum of the evidence lower bound (ELBO). Furthermore, we study the Monte Carlo Batch Normalisation (MCBN) algorithm, proposed as an approximate inference technique parallel to MC Dropout, and show that for larger batch sizes, MCBN fails to capture epistemic uncertainty. Finally, we provide insights into what is required to fix this failure, namely having to view the mini-batch size as a variational parameter in MCBN. We comment on the asymptotics of the ELBO with respect to this variational parameter, showing that as the dataset size increases towards infinity, the batch size must increase towards infinity as well for MCBN to be a valid approximate inference technique.
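The MCBN idea, and the large-batch failure mode noted above, can be seen in a toy 1-D sketch: at test time the input is repeatedly normalised with the statistics of a randomly drawn training mini-batch, and the spread of the outputs is read as epistemic uncertainty. The function and the 1-D setting are illustrative assumptions, not the paper's experimental setup.

```python
import random

def mcbn_predict(x, train_data, batch_size, n_samples=50, eps=1e-5, seed=0):
    """Monte Carlo Batch Normalisation sketch: normalise a test input x
    with the statistics of n_samples randomly drawn training
    mini-batches; return the mean and variance of the outputs.
    The output variance is the MCBN uncertainty estimate."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        batch = rng.sample(train_data, batch_size)
        mean = sum(batch) / batch_size
        var = sum((v - mean) ** 2 for v in batch) / batch_size
        outputs.append((x - mean) / (var + eps) ** 0.5)
    mu = sum(outputs) / n_samples
    sigma2 = sum((o - mu) ** 2 for o in outputs) / n_samples
    return mu, sigma2

train = [0.1 * i for i in range(100)]
# Small batches give noisy statistics and hence non-trivial uncertainty;
# a batch covering most of the dataset collapses the spread.
_, var_small = mcbn_predict(5.0, train, batch_size=4)
_, var_large = mcbn_predict(5.0, train, batch_size=90)
```

As the batch size approaches the dataset size, every sampled batch yields nearly identical statistics, so the predictive variance vanishes regardless of how uncertain the model should be.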


Batch Norm Patent Granted To Google: Is AI Ownership The Gold Rush Of 21st Century?

#artificialintelligence

The machine learning community has witnessed a surge in releases of frameworks, libraries, and software. Tech pioneers like Google, Amazon, and Microsoft have insisted that their intentions behind open-sourcing their technology are benign. However, there has been a growing trend of these tech giants claiming ownership of their innovations. According to a National Bureau of Economic Research study, there were 145 US patent filings that mentioned machine learning in 2010, compared to 594 in 2016. Google in particular filed patents related to machine learning and neural networks 99 times in 2016 alone.


ChronoMID - Cross-Modal Neural Networks for 3-D Temporal Medical Imaging Data

Rakowski, Alexander G., Veličković, Petar, Dall'Ara, Enrico, Liò, Pietro

arXiv.org Machine Learning

ChronoMID builds on the success of cross-modal convolutional neural networks (X-CNNs), making the novel application of the technique to medical imaging data. Specifically, this paper presents and compares alternative approaches - timestamps and difference images - to incorporate temporal information for the classification of bone disease in mice, applied to micro-CT scans of mouse tibiae. Whilst much previous work on diseases and disease classification has been based on mathematical models incorporating domain expertise and the explicit encoding of assumptions, the approaches given here utilise the growing availability of computing resources to analyse large datasets and uncover subtle patterns in both space and time. After training on a balanced set of over 75000 images, all models incorporating temporal features outperformed a state-of-the-art CNN baseline on an unseen, balanced validation set comprising over 20000 images. The top-performing model achieved 99.54% accuracy, compared to 73.02% for the CNN baseline.
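Of the two temporal encodings compared above, a difference image is the simpler to picture: the scan at one timepoint is subtracted pixel-wise from the scan at the next, so the network sees where the anatomy changed rather than two raw intensity maps. A minimal sketch on nested lists (the paper works on micro-CT scans, not toy 2×2 arrays):

```python
def difference_image(img_t0, img_t1):
    """Pixel-wise difference of two registered images taken at
    consecutive timepoints: non-zero only where intensity changed."""
    return [[b - a for a, b in zip(row0, row1)]
            for row0, row1 in zip(img_t0, img_t1)]

scan_t0 = [[0.2, 0.5], [0.1, 0.9]]
scan_t1 = [[0.2, 0.7], [0.4, 0.9]]
diff = difference_image(scan_t0, scan_t1)  # non-zero only where the scans differ
```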


Deep Learning for Audio Transcription on Low-Resource Datasets

Morfi, Veronica, Stowell, Dan

arXiv.org Machine Learning

In training a deep learning system to perform audio transcription, two practical problems may arise. Firstly, most datasets are weakly labelled, having only a list of events present in each recording without any temporal information for training. Secondly, deep neural networks need a very large amount of labelled training data to achieve good performance, yet in practice it is difficult to collect enough samples for most classes of interest. In this paper, we propose factorising the final task of audio transcription into multiple intermediate tasks in order to improve training performance on such low-resource datasets. We evaluate three data-efficient approaches to training a stacked convolutional and recurrent neural network for the intermediate tasks. Our results show that the different training methods have different advantages and disadvantages.